Data from: Lopez-Yrigoyen M, Yang CT, Fidanza A, Cassetta L et al. Genetic programming of macrophages generates an in vitro model for the human erythroid island niche. Nat Commun 2019 Feb 20;10(1):881.
# Load Biobase.
if (! require(Biobase, quietly=TRUE)) {
if (! exists("biocLite")) {
source("https://bioconductor.org/biocLite.R")
}
biocLite("Biobase")
library(Biobase)
}
# Load GEOquery.
if (! require(GEOquery, quietly=TRUE)) {
if (! exists("biocLite")) {
source("https://bioconductor.org/biocLite.R")
}
biocLite("GEOquery")
library(GEOquery)
}
library(knitr)
library(edgeR)
# GSE125150 <- getGEO("GSE125150", GSEMatrix = FALSE)
GSE125150 <- getGEO(filename = "~/projects/GSE125150_family.soft", GSEMatrix = FALSE)
## Reading file....
## Parsing....
## Found 9 entities...
## GPL20301 (1 of 10 entities)
## GSM3564285 (2 of 10 entities)
## GSM3564286 (3 of 10 entities)
## GSM3564287 (4 of 10 entities)
## GSM3564288 (5 of 10 entities)
## GSM3564289 (6 of 10 entities)
## GSM3564290 (7 of 10 entities)
## GSM3564291 (8 of 10 entities)
## GSM3564292 (9 of 10 entities)
kable(data.frame(head(Meta(GSE125150))), format = "html")
| contact_address | contact_city | contact_country | contact_email | contact_institute | contact_laboratory |
|---|---|---|---|---|---|
| 5 Little France Dr | Edinburgh | United Kingdom | s1419027@sms.ed.ac.uk | Centre for Regenerative Medicine | Forrester |
gpl <- names(GPLList(GSE125150))[1]
# gpl_info <- Meta(getGEO(gpl))
| gene_id | source | gene_version | gene_name | gene_biotype | phase | K2MT_A | K2MT_B | K2MT_C | |
|---|---|---|---|---|---|---|---|---|---|
| ENSG00000000003 | ENSG00000000003 | ensembl | 14 | TSPAN6 | protein_coding | NA | 557 | 384 | 416 |
| ENSG00000000005 | ENSG00000000005 | ensembl_havana | 5 | TNMD | protein_coding | NA | 0 | 0 | 0 |
| ENSG00000000419 | ENSG00000000419 | ensembl_havana | 12 | DPM1 | protein_coding | NA | 1052 | 1082 | 1093 |
| ENSG00000000457 | ENSG00000000457 | ensembl_havana | 13 | SCYL3 | protein_coding | NA | 483 | 543 | 627 |
| ENSG00000000460 | ENSG00000000460 | havana | 16 | C1orf112 | protein_coding | NA | 278 | 295 | 398 |
| ENSG00000000938 | ENSG00000000938 | ensembl_havana | 12 | FGR | protein_coding | NA | 7019 | 5414 | 5931 |
| ENSG00000000971 | ENSG00000000971 | ensembl_havana | 15 | CFH | protein_coding | NA | 91 | 52 | 243 |
| ENSG00000001036 | ENSG00000001036 | ensembl_havana | 13 | FUCA2 | protein_coding | NA | 5648 | 4683 | 6030 |
| ENSG00000001084 | ENSG00000001084 | havana | 10 | GCLC | protein_coding | NA | 20047 | 18149 | 23182 |
| ENSG00000001167 | ENSG00000001167 | ensembl_havana | 14 | NFYA | protein_coding | NA | 2588 | 2366 | 2766 |
| ENSG00000001460 | ENSG00000001460 | ensembl_havana | 17 | STPG1 | protein_coding | NA | 137 | 76 | 83 |
| ENSG00000001461 | ENSG00000001461 | ensembl_havana | 16 | NIPAL3 | protein_coding | NA | 1025 | 562 | 730 |
| ENSG00000001497 | ENSG00000001497 | ensembl_havana | 16 | LAS1L | protein_coding | NA | 1155 | 974 | 1178 |
| ENSG00000001561 | ENSG00000001561 | ensembl_havana | 6 | ENPP4 | protein_coding | NA | 122 | 97 | 128 |
| ENSG00000001617 | ENSG00000001617 | havana | 11 | SEMA3F | protein_coding | NA | 9 | 4 | 30 |
## named integer(0)
The data provided was thorough enough to have HGNC symbols for each gene, so there was no need for identifier mapping.
## [1] 60675 15
Started with 60675 genes. Filtered down to 13872 genes.
# Convert back to data.frame so that we can associate gene names with the counts again, because
# matrices cannot hold strings and numerics together.
a <- as.data.frame(normalized_counts)
b <- c("gene_id", colnames(a))
normalized_counts <- merge(expr_filtered[c(2, 5)], cbind(rownames(a), a), by.x="gene_id", by.y="rownames(a)")
write.table(normalized_counts, "~/GSE125150_normalized_counts.txt", sep = "\t")
In the previous assignment, data was retrieved from GEO with the id GSE125150. This data is explores how KLF-1 expression affects macrophage function during the maturation of erythrocyte, in humans. After the raw gene count data underwent conversion to cpm, any uninformative or weakly expressed genes were omitted. Then, the cleaned data was normalized using log2(cpm) and further analysis with MDS plot and mean-variance relationship showed that the samples clustered based on treatment (KLF-1+ vs. KLF-1-) and that the data followed a negative binomial distribution. After cleaning and normalization, the dataset was reduced to 13872 genes starting from 60675.
# Load Biobase.
if (! require(Biobase, quietly=TRUE)) {
if (! exists("biocLite")) {
source("https://bioconductor.org/biocLite.R")
}
biocLite("Biobase")
library(Biobase)
}
# Load GEOquery.
if (! require(GEOquery, quietly=TRUE)) {
if (! exists("biocLite")) {
source("https://bioconductor.org/biocLite.R")
}
biocLite("GEOquery")
library(GEOquery)
}
library(knitr)
library(edgeR)
normalized_count_data <- read.table(file=file.path("~/GSE125150_normalized_counts.txt"))
# Check if data imported correctly:
normalized_count_data[1:5,1:5]
# Converting to matrix, because we have a data.frame:
heatmap_matrix <- normalized_count_data[,3:ncol(normalized_count_data)]
rownames(heatmap_matrix) <- normalized_count_data$gene_id
colnames(heatmap_matrix) <- colnames(normalized_count_data[,3:ncol(normalized_count_data)])
heatmap_matrix
library(ComplexHeatmap)
library(circlize)
# Create a heat map
if(min(heatmap_matrix) == 0){
heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
} else {
heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
}
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
show_row_dend = TRUE,
show_column_dend = TRUE,
col=heatmap_col,
show_column_names = TRUE,
show_row_names = FALSE,
show_heatmap_legend = TRUE
)
# Apply row-normalization
# Subtract the mean from each value and divide by the standard dev of the row to get row-normalization
# (scale() function does this for us).
heatmap_matrix <- t(scale(t(heatmap_matrix)))
if(min(heatmap_matrix) == 0){
heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
} else {
heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
}
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
show_row_dend = TRUE,
show_column_dend = TRUE,
col=heatmap_col,
show_column_names = TRUE,
show_row_names = FALSE,
show_heatmap_legend = TRUE
)
# Roughly, our data clusters based on treatment (KLF-1+ or KLF-1-), also shown by the MDS plot in A1.
plotMDS(heatmap_matrix, labels = rownames(samples), col = c("darkgreen", "blue")[factor(samples$Condition)])
# Define groups:
samples <- data.frame(
lapply(colnames(normalized_count_data)[3:10],
FUN=function(x){
unlist(strsplit(x, split="_"))[c(2,1)]}))
colnames(samples) <- colnames(normalized_count_data)[3:10]
rownames(samples) <- c("sample", "treatment")
samples <- data.frame(t(samples))
samples
# Create data matrix
model_design <- model.matrix(~ samples$treatment)
kable(model_design, type="html")
| (Intercept) | samples$treatmentK2PT |
|---|---|
| 1 | 0 |
| 1 | 0 |
| 1 | 0 |
| 1 | 0 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
expression_matrix <- as.matrix(normalized_count_data[,3:10])
rownames(expression_matrix) <- normalized_count_data$gene_id
colnames(expression_matrix) <- colnames(normalized_count_data)[3:10]
minimal_set <- ExpressionSet(assayData=expression_matrix)
# Fit the data to our linear model
fit <- lmFit(minimal_set, model_design)
# Apply empirical Bayes to compute differential expressions for the model
fit2 <- eBayes(fit,trend=TRUE)
topfit <- topTable(fit2,
coef=ncol(model_design),
adjust.method = "BH",
number = nrow(expression_matrix))
# Merge gene names to topfit table
output_hits <- merge(normalized_count_data[,1:2],
topfit,
by.y=0,by.x=1,
all.y=TRUE)
# Sort by pvalue
output_hits <- output_hits[order(output_hits$P.Value),]
output_hits
# Check the results
kable(output_hits[1:10,],type="html")
| gen | e_id gen | e_name | logFC | AveExpr | t P. | Value ad | j.P.Val | B |
|---|---|---|---|---|---|---|---|---|
| 5581 | ENSG00000136999 | NOV | 34.887889 | 21.792339 | 26.50301 | 0 | 2.35e-05 | 1.839216 |
| 6348 | ENSG00000143537 | ADAM15 | 200.047348 | 182.182205 | 26.17249 | 0 | 2.35e-05 | 1.834049 |
| 824 | ENSG00000068078 | FGFR3 | 6.509106 | 4.371374 | 24.02113 | 0 | 2.35e-05 | 1.795317 |
| 5062 | ENSG00000133101 | CCNA1 | -16.752138 | 10.358061 | -23.54767 | 0 | 2.35e-05 | 1.785418 |
| 5416 | ENSG00000135931 | ARMC9 | -109.559602 | 91.409905 | -23.49629 | 0 | 2.35e-05 | 1.784309 |
| 2556 | ENSG00000106123 | EPHB6 | 38.446298 | 25.271772 | 23.13532 | 0 | 2.35e-05 | 1.776325 |
| 2957 | ENSG00000110651 | CD81 | -320.335578 | 811.888969 | -20.13071 | 0 | 6.12e-05 | 1.693589 |
| 8878 | ENSG00000167779 | IGFBP6 | 24.459781 | 15.279415 | 19.77006 | 0 | 6.12e-05 | 1.681245 |
| 1515 | ENSG00000090776 | EFNB1 | 86.553752 | 84.079504 | 19.51893 | 0 | 6.12e-05 | 1.672275 |
| 4625 | ENSG00000128918 | ALDH1A2 | -169.172552 | 150.016291 | -19.11615 | 0 | 6.51e-05 | 1.657203 |
length(which(output_hits$P.Value < 0.01))
## [1] 2841
length(which(output_hits$adj.P.Val < 0.01))
## [1] 1337
For Multiple Hypothesis Testing, I used the BH method. According to the documentation and literature, BH method controls for the false discovery rate (a less stringent condition than family-wise error rate) and therefore is a more powerful method than those that control for family-wise error rate. My dataset allows many genes to pass even with the stringent condition (with BH), so I decided to use BH. Before correction, 2841 genes passed a cutoff of 0.01, and after correction with the BH method, 1337 genes passed. With a cutoff of 0.05, there were 2859 (after correction) and I considered this as excessive, so I decided to make the cutoff more strict. I could have made the cutoff higher to 0.001; however, I was afraid I would disregard interesting genes, so I kept the cutoff at 0.01 as it is a generally accepted p-value for high significance.
top_hits <- output_hits$gene_id[output_hits$adj.P.Val<0.01]
top_hits
heatmap_matrix_tophits <- t(
scale(t(heatmap_matrix[
which(rownames(heatmap_matrix) %in% top_hits),])))
if(min(heatmap_matrix_tophits) == 0){
heatmap_col = colorRamp2(c( 0, max(heatmap_matrix_tophits)),
c( "white", "red"))
} else {
heatmap_col = colorRamp2(c(min(heatmap_matrix_tophits), 0, max(heatmap_matrix_tophits)), c("blue", "white", "red"))
}
current_heatmap <- Heatmap(as.matrix(heatmap_matrix_tophits),
cluster_rows = TRUE,
cluster_columns = TRUE,
show_row_dend = TRUE,
show_column_dend = TRUE,
col=heatmap_col,
show_column_names = TRUE,
show_row_names = FALSE,
show_heatmap_legend = TRUE,
)
current_heatmap
Based on the heatmap, the conditions (KLF-1+ vs. KLF-1-) cluster together. All the samples in KLF-1+ (K2PT) share similar differential expression profiles and all the samples in KLF-1- (K2MT) share similar differential expression profiles.
output_hits[,"rank"] <- -log(output_hits$adj.P.Val,base =10) * sign(output_hits$logFC)
output_hits <- output_hits[order(output_hits$rank),]
upregulated_genes <- output_hits$gene_name[
which(output_hits$adj.P.Val < 0.01
& output_hits$logFC > 0)]
downregulated_genes <- output_hits$gene_name[
which(output_hits$adj.P.Val < 0.01
& output_hits$logFC < 0)]
upregulated_genes[1:10]
## [1] B4GALT6 MAP7D1 AKAP12 CLTCL1 ABLIM1
## [6] IL10 NDUFAF5 RP11-550P17.5 TRGV6 DNM1L
## 13869 Levels: A1BG-AS1 A2M A2M-AS1 AAAS AACS AADAC AADAT AAED1 AAGAB ... ZZZ3
downregulated_genes[1:10]
## [1] CCNA1 ARMC9 CD81 ALDH1A2 ID2 MAN1A1 PLP2 CTNNB1
## [9] H6PD HS3ST1
## 13869 Levels: A1BG-AS1 A2M A2M-AS1 AAAS AACS AADAC AADAT AAED1 AAGAB ... ZZZ3
write.table(x=upregulated_genes,
file=file.path("KLF1_upregulated_genes.txt"),sep = "\t",
row.names = FALSE,col.names = FALSE,quote = FALSE)
write.table(x=downregulated_genes,
file=file.path("KLF1_downregulated_genes.txt"),sep = "\t",
row.names = FALSE,col.names = FALSE,quote = FALSE)
The extracted genes were put through g:Profiler for ORA. g:Profiler was used because of its easy manipulation. The following options were used. Significance threshold: Benjamini-Hochberg (Again, because of its powerfulness due to its stringence.) User threshold: 0.05 Annotation sources: GO molecular function, GO biological process, Reactome, WikiPathways (These sources were used because we want to know the mechanism of action behind the effect of KLF1+ macrophages on the maturation of erythrocytes, and what kinds of pathways may be involved for this maturation to occur. Other sources, like GO cellular component, were not relevant because we are interested in a broader picture on the pathways and processes)
Below are the images of related terms for the upregulated genes:
knitr::include_graphics("upregulated_genes_1.png")
knitr::include_graphics("upregulated_genes_2.png")
knitr::include_graphics("upregulated_genes_3.png")
knitr::include_graphics("upregulated_genes_4.png")
knitr::include_graphics("downregulated_genes_1.png")
knitr::include_graphics("downregulated_genes_2.png")
knitr::include_graphics("downregulated_genes_3.png")
The results of this analysis broadly supports the original paper’s conclusion, in that KLF-1 induction does lead to differentiation to macrophages that are involved in erythrocyte maturation. The result of g:Profiler shows that upregulated genes are involved in immune system processes, as well as cell-cell adhesion and communication. Also, protein kinase activity seems to be upregulated, and this makes sense because kinases are often related with signalling pathways and activations. Interestingly, these genes are also involved in neuronal processes which could potentially unveil interesting relationships between immune cells that are resident in umbilical cords vs. near neurons. Furthermore, the downregulated genes appear to be involved in kinase inhibition, which corresponds to the upregulated genes. One interesting process that is shown to be downregulated in KLF-1+ macrophages is the regulation of CD4 T cells. This could possibly be due to the fact that these macrophages are obligates to helping the maturation of erythrocytes.
# Filter to only significant genes
output_hits_filtered <- (subset(output_hits,
output_hits$adj.P.Val < 0.01))
# Flip order because rank is in ascending order
output_hits_desc <- output_hits_filtered[nrow(output_hits_filtered):1, ]
# Write only the second(gene name) and ninth(rank) columns.
write.table(x=output_hits_desc[c(2,9)],
file=file.path("KLF1_genelist.rnk"),sep = "\t",
row.names = FALSE,col.names = FALSE,quote = FALSE)
# Top 10 genes and ranks.
kable((output_hits_desc[c(2,9)])[1:10,], format = "html")
| gene_name | rank | |
|---|---|---|
| 2556 | EPHB6 | 4.628656 |
| 824 | FGFR3 | 4.628656 |
| 6348 | ADAM15 | 4.628656 |
| 5581 | NOV | 4.628656 |
| 1515 | EFNB1 | 4.213202 |
| 8878 | IGFBP6 | 4.213202 |
| 9648 | C1QB | 3.898630 |
| 2806 | ABI3 | 3.898630 |
| 11590 | GRK6 | 3.883293 |
| 258 | DNASE1L1 | 3.883293 |
GSEA 4.0.3 was used. Geneset was retrieved from http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/. Because we want the latest GO biological processes and do not want anything inferred from IEA, Human_GOBP_AllPathways_no_GO_iea_April_01_2020_symbol.gmt was used. The following parameters were used: - No collapse - Max size 200 - Min size 15 - Everything else as default.
Below are the top 10 enriched gene sets from upregulated and downregulated genes, respectively.
knitr::include_graphics("GSEA_top10_pos.png")
knitr::include_graphics("GSEA_top10_neg.png")
Based on these lists, the upregulated genes are enriched in neuron projection guidance, which suggests that the genes are involved in neuron development. Furthermore, other top gene sets are also involved in CNS development, such as axon development. These are known results from previous research, for example by Moore, Apara and Goldberg (2011), where KLF genes are transcription factors involved in neuron growth and axon regeneration. Furthermore, Bieker, Ouyang and Chen (1998) also state that among the KLF genes, post-translational phosphorylation of KLF1 allows it to bind beta-globin promotor in erythroid cell lines that do not express beta-globin yet. To support this, the 15th gene set (not shown in figure above) related to positive regulation of tyrosine-peptidyl phosphorylation. Also, a few of the top gene sets appear to be involved in immune cells and function. The downregulated genes are enriched in regulation of protein kinase activity. We cannot conclude if this is allowing phosphorylation of KLF-1 since the geneset term is not specific enough in stating whether it positively or negatively regulates kinase activity.
The stats associated with neuron projection guidance are: - ES = 0.4 - NES = 2.48 - FDR = 0.065 - 31 genes in leading edge - Top gene: EFNB1; Receptor tyrosine kinase involved in migration, repulsion and adhesion during neuronal, vascular and epithelial development.
The stats associated with regulation of protein kinase activity are: - ES = -0.2 - NES = -1.88 - FDR = 1.000 - 59 genes in leading edge - Top gene: MAP3K6 in the MAP kinase signal transduction
Compared to the analysis in A2, the gene set results are more specific to neuronal and axonal growth, rather than general cell binding activity. I think these results can be compared because regardless of thresholded or nonthresholded, the scores observed among the genes prior to gene set enrichment analysis are very good, and therefore the thresholds do not affect the gene list much. With that being said, the thresholds are arbitrary while GSEA uses a non-thresholded approach, being fair by including all genes. So, although I say that they can be compared, I think the nonthresholded analysis would yield a more robust result.
For the following steps, Cytoscape 3.7.2 was used, along with its Apps.
The GSEA results were imported into Cytoscape using the Enrichment Map App. The FDR q-value cutoff was set to 0.1 (initially 0.01 but did not allow any to pass). Even then, this only allowed < 30 terms so instead a p-value cutoff of 0.05 was used. Below is the initial enrichment map.
knitr::include_graphics("initial_enrichment_map_final.png")
AutoAnnotate App was used to annotate the EM, using Community Cluster and Word Cloud: Adjacent Words. The default MCL cluster did not work and other methods were not as informative as Community Cluster. Below is the annotated enrichment map.
knitr::include_graphics("annotated_enrichment_map.png")
The clusters were filtered down to only those containing >3 terms. Below is the filtered annotated enrichment map, with a legend.
knitr::include_graphics("enrichment_map_final_legends_added.png")
The clusters were collapsed to create a themed enrichment map. The major themes highlighted here are: - neutrophil activation, inflammatory response in presence of oxygen, cell morphogenesis and differentiation (likely related to immune cells) - regulation of phosphorylation - ion transport and homeostasis - cell adhesion - matrisome - apoptosis These themes mildly fit with the model, where KLF-1 in macrophage is contributing to creating an erythroblastic island niche. Although the analysis is not suggesting anything as specific as the model, it does suggest how KLF-1 is involved in cell morphogenesis, likely of immune cells, since the cluster is connected to the inflammatory response cluster. Based on this EM, an interesting and novel pathway seen is neutrophil activation. As far as I have researched for papers, there were no direct KLF-1 activity associated with neutrophil activation.
Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods?
Partially. This result suggests how KLF-1 could be involved in cell morphogenesis of immune cells, but does not go as far to suggest which immune cells it plays a role in, aside from neutrophils (which is not the conclusion discussed in the paper). Based on the specific GO BP terms in the EM, the analysis more so supports that KLF-1 plays a role in cell morphogenesis of neuronal cells, instead of erythrocytes.
Moreover, these results were much more specific compared to Assignment 2, where results were not specific for immune cells and more geared towards neurons. The rest were quite similar, for example, capturing terms associated with protein kinase activity and cell adhesion.
Can you find evidence, i.e. publications, to support some of the results that you see. How does this evidence support your result?
Yes. As stated previously in this report, Moore, Apara and Goldberg (2011) have concluded that the KLF family of transcription factors are involved in neurite outgrowth and axon regeneration, which are directly related to what we have observed in our enrichment map, under the cell morphogenesis theme. Furthermore, Bieker, Ouyang and Chen (1998) have shown that KLF-1, upon phosphorylation, allows transcription of beta-globin, in erythroid cell lines that have not yet expressed beta-globin. The enrichment map has a ‘regulation of phosphorylation’ theme, with many terms relating to positive regulation associated with KLF-1+ cells. Therefore, results from Bieker, Ouyang and Chen support the results seen here.
After enrichment analysis, my results seemed quite close to those that were achieved by the original publication. One thing that lacked in my results was specificity for erythrocytes, so I decided to explore this further to see if I could possibly get closer to the results from the original publication. Therefore, I selected the erythrocyte differentiation pathway signature set from MSigDB. By using this for post analysis, I figured that I could arrive at a more confident conclusion on whether my results are in line with the original publication.
The gene set I have selected is BIOCARTA_ERYTH_PATHWAY, from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/geneset_page.jsp?geneSetName=BIOCARTA_ERYTH_PATHWAY&keywords=erythrocyte).
I imported this into the main enrichment map that I generated and below is the post-analysis enrichment map. I’ve set the parameters to Mann-Whitney (One-sided Greater) because I want the erythrocyte differentiation genes to be expressed at the top end genes in my dataset. Unfortunately, the statistical values were not significant (although almost significant), so the cutoff was set to 0.15 instead of the ideal 0.05.
knitr::include_graphics("postanalysis_eryth.png")
This post analysis shows that indeed many of the genes expressed in KLF-1+ macrophages are related to erythrocyte differentiation, which supports the findings made by the original study. Therefore, although not confident due to insignificant statistical values, we can say that the genes are likely involved in inflammatory responses and cell morphogenesis for erythrocytic cell lines, and neutrophil activation could play an important role in differentiation of erythrocytes. * The original network was subsetted to include only the nodes from clusters with >3 terms, in the post analysis.
Ouyang L, Chen X, Bieker JJ. Regulation of erythroid Krüppel-like factor (EKLF) transcriptional activity by phosphorylation of a protein kinase casein kinase II site within its interaction domain. J Biol Chem. 1998 Sep 4;273(36):23019-25.
Isserlin et al. Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results. F1000Res. 2014; 3:141.
Kucera et al. AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations. F1000Res. 2016 July 15;5:1717.
Liberzon et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011 June 15;27(12):1739-40.
Lopez-Yrigoyen M, Yang CT, Fidanza A, Cassetta L et al. Genetic programming of macrophages generates an in vitro model for the human erythroid island niche. Nat Commun 2019 Feb 20;10(1):881.
Montojo et al. GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop. Bioinformatics. 2010 Nov 15;26(22):2927-2928.
Moore DL, Apara A, and Goldberg JL. Krüppel-like transcription factors in the nervous system: novel players in neurite outgrowth and axon regeneration. Mol Cell Neurosci. 2011 Aug;47(4):233-43.
Morris et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics. 2011 November 9;12(436).
Oesper et al. WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code Biol Med. 2011 April 7;6:7.
Shannon et al. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003 Nov;13(11):2498-504.
Subramanian et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005 Oct 25;102(43):15545-50.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015). “limma powers differential expression analyses for RNA-sequencing and microarray studies.” Nucleic Acids Research, 43(7), e47. doi: 10.1093/nar/gkv007.
Robinson MD, McCarthy DJ, Smyth GK (2010). “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.” Bioinformatics, 26(1), 139-140. doi: 10.1093/bioinformatics/btp616.
Uku Raudvere, Liis Kolberg, Ivan Kuzmin, Tambet Arak, Priit Adler, Hedi Peterson, Jaak Vilo: g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update) Nucleic Acids Research 2019; doi:10.1093/nar/gkz369